Add new compactor metrics for tenant-prefixed object store #322
willh-db wants to merge 1 commit into databricks:db_main from
runWebServer(g, ctx, logger, cancel, reg, &conf, component, tracer, progressRegistry, globalBaseMetaFetcher, api, srv)

for _, tenantPrefix := range tenantPrefixes {
	compactMetrics.tenantAssigned.WithLabelValues(tenantPrefix).Set(1)
nit: remind me what tenantPrefixes is?
it was v1/raw/<tenant> but now it's just <tenant>
	Help: "Total number of compaction iterations completed successfully per tenant.",
}, []string{"tenant"})
m.tenantAssigned = promauto.With(reg).NewGaugeVec(prometheus.GaugeOpts{
	Name: "thanos_compact_tenant_assigned",
We also have the metric thanos_blocks_meta_assigned for the tenant view (which tenant got assigned to which compactor). Could we reuse that?
blocks_meta is more like "how many blocks seen per tenant" whereas tenant_iterations is "how many times has compaction run per tenant." I think this will be valuable for ensuring liveness and being able to alert on compaction stalls.
tenant_assigned is useful to check on startup but in steady-state it is redundant with the other two. Let me know if you think it's worth keeping
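The liveness argument above can be made concrete with a small sketch. It assumes two scrapes of the per-tenant iteration counter are available as plain maps, and it flags tenants whose count did not increase between them. The function name `stalledTenants` and the sample tenant names are hypothetical, and counter resets after a restart are deliberately not handled (in practice a PromQL `increase()` over the counter would cover that).

```go
package main

import (
	"fmt"
	"sort"
)

// stalledTenants is a hypothetical helper, not part of the PR.
// It returns the tenants whose iteration counter did not increase
// between an earlier sample (prev) and a later one (curr).
// A tenant missing from curr is treated as stalled.
// Simplification: a counter reset (restart) is also reported as a stall.
func stalledTenants(prev, curr map[string]float64) []string {
	var stalled []string
	for tenant, before := range prev {
		after, ok := curr[tenant]
		if !ok || after <= before {
			stalled = append(stalled, tenant)
		}
	}
	// Sort so the result does not depend on map iteration order.
	sort.Strings(stalled)
	return stalled
}

func main() {
	// Made-up sample values for illustration only.
	prev := map[string]float64{"tenant-a": 10, "tenant-b": 4}
	curr := map[string]float64{"tenant-a": 12, "tenant-b": 4}
	fmt.Println(stalledTenants(prev, curr)) // prints [tenant-b]
}
```

An alert built on the same idea would fire when a tenant's counter shows no increase over a window longer than the expected compaction interval.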
jnyi left a comment
might reuse existing metric thanos_blocks_meta_assigned
Changes
- thanos_compact_tenant_assigned{tenant} gauge to expose which tenants are assigned to each compactor instance, enabling verification of tenant partitioning across replicas
- thanos_compact_tenant_iterations_total{tenant} counter to track successful compaction iterations per tenant, enabling verification that compaction is completing end-to-end for every tenant

The existing compactor metrics are either global (thanos_compact_iterations_total) or only carry a resolution label (thanos_compact_group_compaction_runs_*). In multitenant mode, there is no way to confirm via metrics that:

These two new metrics close that gap with minimal overhead.